A well-tested parser for parsing metadata out of fenced code blocks in Markdown.
Overview
Assuming you have this code fence in your Markdown,
```ts twoslash {1-3, 5} title="Hello, World"
```
With remark, you get two pieces of information about that code block, `lang` and `meta`, like this:
{
"lang": "ts",
"meta": "twoslash {1-3, 5} title=\"Hello, World\""
}
Use fenceparser to parse the `meta` string into a useful object.
import parse from 'fenceparser'
console.log(parse(meta))
The parser intentionally does not handle the language part, since Markdown parsers usually take care of it.
However, if you want to allow loose syntax such as `ts{1-3, 5}` as well as `ts {1-3, 5}` (used by gatsby-remark-vscode, for example), remark won't parse the language correctly:
{
"lang": "ts{1-3,", // because remark uses space to split
"meta": "5}"
}
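The split shown above can be reproduced with a minimal sketch. `splitInfoString` is a hypothetical helper written for illustration, not part of remark or fenceparser; it only assumes, as noted above, that the info string is cut at the first whitespace.

```typescript
// Hypothetical helper for illustration only (not remark's or fenceparser's API).
// It cuts a fence info string at the first whitespace character.
interface FenceInfo {
  lang: string | null
  meta: string | null
}

function splitInfoString(info: string): FenceInfo {
  const trimmed = info.trim()
  if (trimmed === '') return { lang: null, meta: null }

  const index = trimmed.search(/\s/)
  // No whitespace at all: the whole string is the language.
  if (index === -1) return { lang: trimmed, meta: null }

  return {
    lang: trimmed.slice(0, index),
    meta: trimmed.slice(index).trim(),
  }
}

// Strict syntax: the language is separated from the metadata by a space.
const strictInfo = splitInfoString('ts twoslash {1-3, 5} title="Hello, World"')

// Loose syntax: the first space sits inside the braces, so the split lands there.
const looseInfo = splitInfoString('ts{1-3, 5}')
```

With the loose form, `looseInfo.lang` ends up as `ts{1-3,` and `looseInfo.meta` as `5}`, which is the mis-split described above.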
In these cases, you can use the library's `lex` function to get a properly tokenized array, then take the first element as `lang`. For example:
import { lex, parse } from 'fenceparser'
const full = [node.lang, node.meta].join(' ')
const tokens = lex(full)
const lang = tokens.shift()
const meta = parse(tokens)
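To see why joining `node.lang` and `node.meta` helps, here is a rough sketch that does not use fenceparser at all. `exampleNode` is a made-up stand-in for a remark code node, and `peelLanguage` is a hypothetical, deliberately naive substitute for the `lex` and `shift` steps: it assumes the language is the leading run of letters, digits, `_`, `+`, `#` and `-`, which is an assumption of this sketch rather than the library's actual tokenizing rule.

```typescript
// Made-up stand-in for a remark code node holding the mis-split values.
const exampleNode = { lang: 'ts{1-3,', meta: '5}' }

// Joining the two halves with a space restores the original info string.
const fullInfo = [exampleNode.lang, exampleNode.meta].join(' ')

// Hypothetical, naive replacement for lex + shift (NOT fenceparser's lexer):
// treat the leading identifier-like run as the language, the rest as metadata.
function peelLanguage(info: string): { lang: string; rest: string } {
  const match = /^[A-Za-z0-9_+#-]*/.exec(info)
  const lang = match ? match[0] : ''
  return { lang, rest: info.slice(lang.length).trim() }
}

const peeled = peelLanguage(fullInfo)
```

Here `fullInfo` is `ts{1-3, 5}` again, and `peeled` separates `ts` from `{1-3, 5}`. The real `lex` function produces tokens rather than a string remainder, so treat this only as an illustration of the idea.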
Syntax
The grammar is loosely based on conventions used by various syntax highlighters. The rules are:
- Valid HTML attributes can be used: `attribute`, `data-attribute`, etc.
- Attributes without values are assigned `true`
- Attribute values can be single or double quoted strings, int/float numbers, booleans, objects or arrays
- Non-quoted strings are valid as long as they are not separated by whitespace or a line break, e.g. `attr=--theme-color`
- Objects can accept valid attributes as children, or valid attributes with a value assigned by the `:` keyword, e.g. `{1-3, 5, ids: {7}}`
- Arrays are just like JavaScript's arrays
- Objects without attribute keys, e.g. `{1-3} {7}`, are merged and assigned to the `highlight` object
- No trailing commas
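As a rough illustration of the merging rule, the sketch below shows one way keyless objects could be folded into a single `highlight` object. `mergeHighlights` is a hypothetical helper, and the representation is an assumption: each entry inside a keyless object is treated like a valueless attribute and stored as `true`. The exact shape fenceparser returns is not confirmed here.

```typescript
// Assumed representation: each range or line number is a key set to true.
type Highlight = Record<string, true>

// Hypothetical helper (not fenceparser's API): merge several keyless
// objects into one highlight object, later entries overwriting earlier ones.
function mergeHighlights(groups: Highlight[]): Highlight {
  const merged: Highlight = {}
  for (const group of groups) {
    Object.assign(merged, group)
  }
  return merged
}

// `{1-3} {7}` under the assumption above.
const mergedHighlight = mergeHighlights([{ '1-3': true }, { '7': true }])

// A plausible result shape for `twoslash {1-3, 5} title="Hello, World"`,
// derived from the rules above rather than from running the library.
const assumedMeta = {
  twoslash: true, // attribute without a value
  highlight: mergeHighlights([{ '1-3': true, '5': true }]),
  title: 'Hello, World', // double-quoted string value
}
```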
Acknowledgements
- This project was initially made for use with Twoslash.
- The `Lexer` and `Parser` are based on the examples from the book Crafting Interpreters.